We have identified a Y-chromosomal lineage with several unusual features. It was found in 16 populations throughout 
a large region of Asia, stretching from the Pacific to the Caspian Sea, and was present at high frequency: ~8% of 
the men in this region carry it, and it thus makes up ~0.5% of the world total. The pattern of variation within 
the lineage suggested that it originated in Mongolia ~1,000 years ago. Such a rapid spread cannot have occurred 
by chance; it must have been a result of selection. The lineage is carried by likely male-line descendants of Genghis 
Khan, and we therefore propose that it has spread by a novel form of social selection resulting from their behavior. 


The patterns of variation found in human DNA are usu- 
ally considered to result from a balance between neutral 
processes and natural selection. Among the former, mu- 
tation, recombination, and migration increase variation, 
whereas genetic drift decreases it. Natural selection can 
act to remove deleterious variants (purifying selection), 
maintain polymorphism (balancing selection), or pro- 
duce a trend (directional selection). Clear examples of 
the latter are rare in humans, but probable cases, such 
as those associated with resistance to malaria (Hamblin 
and Di Rienzo 2000) or unidentified pathogens (Ste- 
phens et al. 1998), can be recognized by the “signature” 
they leave in the genome. The rapid increase in frequency 
of the selected allele and its linked sequences results in 
a haplotype that is found at higher frequency than would 
be expected from its degree of variation. We have now 
identified such a haplotype on the Y chromosome, but 
we suggest that its spread results not from a biological 
advantage, but from human activities recorded in history. 

In surveys of DNA variation in Asia, we typed 2,123 
men with ≥32 markers to produce a Y haplotype for 
each man; these included 1,126 individuals described 
elsewhere (Qamar et al. 2002; Zerjal et al. 2002). Over 
90% of the haplotypes showed the usual pattern (Moh- 
yuddin et al. 2001): most males had a unique code; and 
the few haplotypes present in more than one individual 
were generally found within the same population. How- 
ever, we also saw one pattern that was novel in two 
respects. First, there was a high frequency of a cluster 
of closely related lineages, collectively called the “star 
cluster” (fig. 1, shaded area). Second, star-cluster chro- 
mosomes were found in 16 populations throughout a 
large geographical area extending from Central Asia to 
the Pacific (fig. 2); thus, they do not result from an event 
specific to any single population. We can deduce the 
most likely time to the most recent common ancestor 
(TMRCA) and place of origin of this unusual lineage 
from the observed genetic variation. To do this, it is first 
necessary to distinguish star-cluster chromosomes from 
the remainder. For this, we used the criterion that hap- 
lotypes linked to the central one in the shaded area of 
the network without gaps would be included (fig. 1). 
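
The inclusion criterion can be illustrated with a minimal sketch. It assumes, as a simplification of the median-joining network, that two haplotypes are "linked without gaps" when both were observed and they differ by a single repeat unit at one microsatellite locus. The function names and the three-locus toy haplotypes are hypothetical and are not data from this study; a real network may also connect haplotypes through inferred intermediate nodes, which this sketch ignores.

```python
from collections import deque

def one_step_apart(a, b):
    """True if two haplotypes differ by exactly one repeat unit at one locus."""
    return sum(abs(x - y) for x, y in zip(a, b)) == 1

def star_cluster(central, observed):
    """Collect haplotypes reachable from `central` through chains of observed one-step neighbors."""
    observed = set(observed) | {central}
    cluster, queue = {central}, deque([central])
    while queue:
        current = queue.popleft()
        # `observed - cluster` is a fresh set, so adding to `cluster` inside the loop is safe.
        for hap in observed - cluster:
            if one_step_apart(current, hap):
                cluster.add(hap)
                queue.append(hap)
    return cluster

# Hypothetical toy haplotypes (repeat counts at three loci), for illustration only.
central = (10, 16, 25)
observed = [(10, 16, 26), (10, 17, 26), (10, 16, 28), (11, 16, 25)]
members = star_cluster(central, observed)
```

In this toy example, (10, 16, 28) is left out because no observed haplotype bridges the two-step gap to the rest of the cluster.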

Figure 1. Median-joining network (Bandelt et al. 1999) representing Y-chromosomal variation within haplogroup C*(xC3c). Chromosomes 
were typed with a minimum of 16 binary markers (Qamar et al. 2002; Zerjal et al. 2002; our unpublished observations), including RPS4Y and 
M48, to define the lineage C*(xC3c) (Y-Chromosome-Consortium 2002), also known as haplogroup 10, derived for RPS4Y and ancestral for 
M48. Sixteen Y microsatellites were also typed, but DYS19 was excluded from the network analysis because it is duplicated in haplogroup C. 
The central star-cluster profile is 10-16-25-10-11-13-14-12-11-11-11-12-8-10-10, for the loci DYS389I-DYS389b-DYS390-DYS391-DYS392- 
DYS393-DYS388-DYS425-DYS426-DYS434-DYS435-DYS436-DYS437-DYS438-DYS439. Circles represent lineages, area is proportional to 
frequency, and color indicates population of origin. Lines represent microsatellite mutational differences. 

Figure 2. Geographical distribution of star-cluster chromosomes. Populations are shown as circles with an area proportional to sample 
size; star-cluster chromosomes are indicated by green sectors. The shaded area represents the extent of Genghis Khan’s empire at the time of 
his death (Morgan 1986). 


We then used two approaches to calculate a TMRCA 
for the star-cluster chromosomes. The program BAT- 
WING (Wilson and Balding 1998) uses models of both 
mutation and population processes, which were specified 
as described elsewhere (Qamar et al. 2002). With this 
program, we estimated ~1,000 years for the TMRCA 
(95% confidence interval limits ~700-1,300 years). The 
use of alternative demographic models with constant or 
exponentially increasing population size changed the estimate 
by <10%. A method that does not consider population 
structure (Morral et al. 1994), ρ, suggested ~860 
(~590-1,300) years. In both calculations, we assumed 
a generation time of 30 years. The origin was most likely 
in Mongolia, where the largest number of different star- 
cluster haplotypes is found (fig. 1). Thus, a single male 
line, probably originating in Mongolia, has spread in the 
last ~1,000 years to represent ~8% of the males in a 
region stretching from northeast China to Uzbekistan. 
If this spread were due to a general population expan- 
sion, we would expect to find multiple lineages with the 
same characteristics of high frequency and presence in 
multiple populations, but we do not (Zerjal et al. 2002). 
The star-cluster pattern is unique. 
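
A minimal sketch of the second, simpler kind of age estimate follows. It assumes that the statistic is the mean number of single-repeat mutational steps separating each sampled chromosome from the central haplotype, and that dividing this by the summed per-generation mutation rate across loci gives an age in generations. The mutation rate, the toy sample, and the function names are hypothetical placeholders; only the 30-year generation time comes from the text. The sketch does not reproduce the BATWING analysis or the published estimates.

```python
def rho(central, sample):
    """Mean number of repeat-unit steps from the central haplotype (a simple stepwise count)."""
    steps = [sum(abs(x - y) for x, y in zip(central, hap)) for hap in sample]
    return sum(steps) / len(steps)

def tmrca_years(central, sample, mu_per_locus, generation_years=30):
    """Age estimate: rho divided by the total mutation rate over all loci, scaled to years."""
    generations = rho(central, sample) / (mu_per_locus * len(central))
    return generations * generation_years

# Hypothetical inputs for illustration only (not the study's data or mutation rate).
central = (10, 16, 25, 10)
sample = [central, central, (10, 16, 26, 10), (10, 17, 25, 10)]
age = tmrca_years(central, sample, mu_per_locus=0.002)
```

Because the estimate scales inversely with the assumed mutation rate, a lower rate would give a proportionally older age for the same sample.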

This rise in frequency, if spread evenly over ~34 gen- 
erations, would require an average increase by a factor 
of ~1.36 per generation and is thus comparable to the 
most extreme selective events observed in natural populations, 
such as the spread of melanic moths in 19th-century England 
in response to industrial pollution (Edleston 1865). We 
evaluated whether it could have 
occurred by chance. If the population growth rate is 
known, it is possible to test whether the observed fre- 
quency of a lineage is consistent with its level of vari- 
ation, assuming neutrality (Slatkin and Bertorelle 2001). 
Using this method, we estimated the chance of finding 
the low degree of variation observed in the star cluster, 
with a current frequency of ~8%, under neutral conditions. 
Even with the demographic model most likely to lead to rapid 
increase of the lineage, double exponential growth, the 
probability was <10⁻²³⁷; if the mutation rate were 10 times 
lower, the probability would still be <10⁻¹⁰. 
Thus, chance can be excluded: selection must have acted 
on this haplotype. 
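
The per-generation arithmetic in this paragraph can be checked with a short calculation. It uses only numbers stated in the text (~1,000 years, a 30-year generation time, ~34 generations, and a factor of ~1.36 per generation); the variable and function names are illustrative, and the total fold increase is simply what that factor compounds to, not a value reported in the text.

```python
years = 1000            # approximate TMRCA from the text
generation_years = 30   # generation time assumed in the text
factor = 1.36           # stated average increase per generation

generations = years / generation_years   # about 33.3, consistent with the ~34 used in the text
total_fold_increase = factor ** 34       # compounded increase over ~34 generations

def per_generation_factor(fold_increase, n_generations):
    """Constant per-generation factor that compounds to a given total fold increase."""
    return fold_increase ** (1 / n_generations)
```

Compounded over ~34 generations, a factor of ~1.36 amounts to an increase of tens of thousands of times, which gives a sense of the scale involved.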

Could biological selection be responsible? Although 
this possibility cannot be entirely ruled out, the small 
number of genes on the Y chromosome and their spe- 
cialized functions provide few opportunities for selection 
(Jobling and Tyler-Smith 2000). It is therefore necessary 
to look for alternative explanations. Increased repro- 
ductive fitness, transmitted socially from generation to 
generation, of males carrying the same Y chromosome 
would lead to the increase in frequency of their Y lineage, 
and this effect would be enhanced by the elimination of 
unrelated males. Within the last 1,000 years in this part 
of the world, these conditions are met by Genghis (Chin- 
gis) Khan (c. 1162-1227) and his male relatives. He es- 
tablished the largest land empire in history and often 
slaughtered the conquered populations, and he and his 
close male relatives had many children. Although the 
Mongol empire soon disintegrated as a political unit, his 
male-line descendants ruled large areas of Asia for many 
generations. These included China, where the Yüan Dynasty 
emperors remained in power until 1368, after 
which the Mongols continued to dominate the country 
north of the Great Wall for several more centuries, and 
the region west to the Aral Sea, where the Chaghatai 
Khans ruled. Although their power diminished over 
time, they remained at Kashghar near the Kyrgyzstan/ 
China border until the middle of the 17th century (Mor- 
gan 1986). 

It is striking that the boundary of the Mongol empire 
when Genghis Khan died (fig. 2), which also corresponds 
to the boundaries of the regions controlled by later Khans, 
matches the distribution of star-cluster chromosomes 
closely, with one exception: the Hazaras. We therefore 
wished to compare Genghis Khan’s Y profile with the 
star cluster. It is not possible to examine his remains 
directly, but history provides an alternative. The Hazaras 
of Pakistan have a Mongol origin (Qamar et al. 2002), 
and many consider themselves to be direct male-line de- 
scendants of Genghis Khan. A genealogy documenting 
these links has been constructed from their oral history 
(Mousavi 1998). A large proportion of the Hazara pro- 
files do indeed lie in the star cluster, which is not oth- 
erwise seen in Pakistan (fig. 2), thus supporting their 
oral tradition and suggesting that Genghis Khan carried 
the star-cluster haplotype. 

The Y chromosome of a single individual has spread 
rapidly and is now found in ~8% of the males through- 
out a large part of Asia. Indeed, if our sample is rep- 
resentative, this chromosome will be present in about 
16 million men, ~0.5% of the world’s total. The avail- 
able evidence suggests that it was carried by Genghis 
Khan. His Y chromosome would obviously have had 
ancestors, and our best estimate of the TMRCA of star- 
cluster chromosomes lies several generations before his 
birth. Several scenarios, which are not mutually exclu- 
sive, could explain its rapid spread: (1) all populations 
carrying star-cluster chromosomes could have descended 
from a common ancestral population in which it was 
present at high frequency; (2) many or most Mongols 
at the time of the Mongol empire could have carried 
these chromosomes; (3) it could have been restricted to 
Genghis Khan and his close male-line relatives, and this 
specific lineage could have spread as a result of their 
activities. Explanation 1 is unlikely because these pop- 
ulations do not share other Y haplotypes, and expla- 
nation 2 is difficult to reconcile with the high Y-haplo- 
type diversity of modern Mongolians (Zerjal et al. 2002). 
The historically documented events accompanying the 
establishment of the Mongol empire would have con- 
tributed directly to the spread of this lineage by Genghis 
Khan and his relatives, but perhaps as important was the 
establishment of a long-lasting male dynasty. This sce- 
nario shows selection acting on a group of related men; 
group selection has been much discussed (Wilson and 
Sober 1994) and is distinguished by the property that 
the increased fitness of the group is not reducible to the 
increased fitness of the individuals. It is unclear whether 
this is the case here. Our findings nevertheless demon- 
strate a novel form of selection in human populations 
on the basis of social prestige. A founder effect of this 
magnitude will have influenced allele frequencies else- 
where in the genome: mitochondrial DNA lineages will 
not be affected, since males do not transmit their mi- 
tochondrial DNA, but, in the simplest models, the foun- 
der male will have been the ancestor of each autosomal 
sequence in ~4% of the population and X-chromosomal 
sequence in ~2.7%, with implications for the medical 
genetics of the region. Large-scale changes to patterns 
of human genetic variation can occur very quickly. Al- 
though local influences of this kind may have been com- 
mon in human populations, it is, perhaps, fortunate that 
events of this magnitude have been rare. 
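
The proportions quoted in this paragraph can be related to one another with a short calculation, sketched here under stated assumptions: that the ~8% figure is the founder's share of Y chromosomes in the region, that half of the copies of an autosomal sequence are carried by males, and that one-third of X chromosomes are carried by males when men and women are equally numerous. These scaling fractions are one reading of the "simplest models" mentioned in the text, not a derivation given there, and the implied totals of men are back-calculated from the stated 16 million, ~8%, and ~0.5%.

```python
y_share = 0.08          # star-cluster share of Y chromosomes in the region (from the text)
carriers = 16_000_000   # estimated number of carriers (from the text)
world_share = 0.005     # stated share of the world's men

# Assumed scaling: half of autosomal copies and one-third of X chromosomes are in males.
autosomal_share = y_share * (1 / 2)
x_share = y_share * (1 / 3)

# Totals implied by the stated figures (back-calculated, not reported in the text).
men_in_region = carriers / y_share
men_in_world = carriers / world_share
```

Under these assumptions the shares come out at 4% for autosomal sequences and about 2.7% for X-chromosomal sequences, matching the figures quoted above.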